Applications of maximum likelihood principal component analysis: incomplete data sets and calibration transfer

Authors

  • Darren T. Andrews
  • Peter D. Wentzell
Abstract

The application of a new method to the multivariate analysis of incomplete data sets is described. The new method, called maximum likelihood principal component analysis (MLPCA), is analogous to conventional principal component analysis (PCA), but incorporates measurement error variance information in the decomposition of multivariate data. Missing measurements can be handled in a reliable and simple manner by assigning large measurement uncertainties to them. The problem of missing data is pervasive in chemistry, and MLPCA is applied to three sets of experimental data to illustrate its utility. For exploratory data analysis, a data set from the analysis of archeological artifacts is used to show that the principal components extracted by MLPCA retain much of the original information even when a significant number of measurements are missing. Maximum likelihood projections of censored data can often preserve original clusters among the samples and can, through the propagation of error, indicate which samples are likely to be projected erroneously. To demonstrate its utility in modeling applications, MLPCA is also applied in the development of a model for chromatographic retention based on a data set which is only 80% complete. MLPCA can predict missing values and assign error estimates to these points. Finally, the problem of calibration transfer between instruments can be regarded as a missing data problem in which entire spectra are missing on the ‘slave’ instrument. Using NIR spectra obtained from two instruments, it is shown that spectra on the slave instrument can be predicted from a small subset of calibration transfer samples even if a different wavelength range is employed. Concentration prediction errors obtained by this approach were comparable to cross-validation errors obtained for the slave instrument when all spectra were available.
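
The idea of handling a missing measurement by assigning it a large uncertainty can be illustrated with a small sketch. The Python code below is not the authors' MLPCA algorithm. It is a simplified weighted low-rank fit that assumes independent measurement errors with known standard deviations. The function name `weighted_low_rank`, the noise-free synthetic data, and the chosen standard deviations (0.01 for observed entries, 1e6 for missing ones) are all assumptions made for illustration.

```python
import numpy as np

def weighted_low_rank(X, sd, rank, n_iter=200):
    """Fit X ~ T @ P.T by alternating weighted least squares.

    Each entry is weighted by 1 / sd**2, so an entry with a very large
    standard deviation (a "missing" value) barely affects the fit.
    Assumes independent errors; this is a simplification, not full MLPCA.
    """
    W = 1.0 / sd**2
    m, n = X.shape
    # Start from the loadings of an ordinary truncated SVD.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:rank].T                      # n x rank loadings
    T = np.zeros((m, rank))              # m x rank scores
    for _ in range(n_iter):
        # Update each row of scores with the loadings held fixed.
        for i in range(m):
            A = P.T * W[i]               # loadings weighted by row i
            T[i] = np.linalg.solve(A @ P, A @ X[i])
        # Update each row of loadings with the scores held fixed.
        for j in range(n):
            B = T.T * W[:, j]            # scores weighted by column j
            P[j] = np.linalg.solve(B @ T, B @ X[:, j])
    return T @ P.T

# Hypothetical rank-2 data set with eight entries treated as missing.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 8))
missing = np.zeros(X_true.shape, dtype=bool)
missing[[0, 3, 5, 7, 10, 12, 15, 18], [1, 4, 0, 7, 2, 5, 3, 6]] = True

X = np.where(missing, 0.0, X_true)       # placeholder value for missing entries
sd = np.where(missing, 1e6, 0.01)        # large uncertainty marks them as missing

X_hat = weighted_low_rank(X, sd, rank=2)
print("largest error at missing entries:", np.abs(X_hat - X_true)[missing].max())
```

Because the missing entries carry almost no weight, the placeholder values stored there have essentially no influence on the fit, and the fitted matrix supplies estimates for them from the low-rank structure of the remaining data.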

Similar resources

Maximum likelihood multivariate calibration.

Two new approaches to multivariate calibration are described that, for the first time, allow information on measurement uncertainties to be included in the calibration process in a statistically meaningful way. The new methods, referred to as maximum likelihood principal components regression (MLPCR) and maximum likelihood latent root regression (MLLRR), are based on principles of maximum likel...

Derivative Preprocessing and Optimal Corrections for Baseline Drift in Multivariate Calibration

The characteristics of baseline drift are discussed from the perspective of error covariance. From this standpoint, the operation of derivative filters as preprocessing tools for multivariate calibration is explored. It is shown that convolution of derivative filter coefficients with the error covariance matrices for the data tends to reduce the contributions of correlated error, thereby reducin...

Application of Maximum Likelihood Principal Components Regression to Fluorescence Emission Spectra

The application of maximum likelihood multivariate calibration methods to the fluorescence emission spectra of mixtures of acenaphthylene, naphthalene, and phenanthrene in acetonitrile is described. Maximum likelihood principal components regression (MLPCR) takes into account the measurement error structure in the spectral data in constructing the calibration model. Measurement errors for the ...

Feature Dimension Reduction of Multisensor Data Fusion using Principal Component Fuzzy Analysis

Much current research, across many applications and with many different tools, focuses on how to achieve awareness. One serious application is awareness of the behavior and activities of patients, which matters because individuals need ubiquitous medical care. It is sometimes very important for the doctor to know the patient's physical condition. O...

Representing Spectral data using LabPQR color space in comparison to PCA method

In many applications of color technology, such as spectral color reproduction, it is of interest to represent spectral data with fewer dimensions than the spectral space has. For more than half a century, principal component analysis (PCA) has been applied to find the number of independent basis vectors of a spectral dataset and to represent spectral reflectance with lower di...


Publication date: 2003